Language and Speech Processing

Authors

  • Sanne Korzec
  • Bart J. Buter
Abstract

At the end of the 19th century, L. L. Zamenhof proposed Esperanto, intended as a global language to be spoken and understood by everyone. Its inventor hoped that a common language could resolve global problems that lead to conflict. This idealistic idea did not reach its full potential, yet there are still scientific fields pursuing its legacy, though not necessarily from an ideological point of view. The Internet contains billions of web pages, and because they come in all kinds of languages, a great deal of information is not available to us. A practical application would be a browser that translates these pages into a preferred language. In the field of statistical machine translation (SMT), we try to build algorithms that translate from one language to another using only statistics taken from large bi-text corpora. When we adopt the SMT approach, we represent each individual word or group of words (a cept) as having a connection to zero, one or many foreign cepts, each with a probability value. This means that both the alignments and the probabilities need to be extracted from the bi-text. The main problem here is that we need the alignments to estimate the probabilities and the probabilities to estimate the alignments. Problems of this kind can be solved with the EM algorithm; one approach [2, 3] is particularly favored because it estimates reasonable models. In this paper, we present the SMT Aligner (SMTA), which word-aligns French-English sentences given word translation probabilities provided by [1], to estimate good alignments under the assumptions of IBM Model 1. We then compare results from alignments that use the null word with ones that do not. We also introduce a heuristic that increases the F1-score by 4 percent. In Section 2 we describe the theory that forms the basis of the SMTA. Section 3 describes the research methodology. In Section 4 we present our results, and we conclude in Section 5 with a discussion.
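To make the alignment step concrete, the following is a minimal sketch and not the authors' SMTA implementation. It assumes that translation probabilities t(f | e) are already available as a dictionary, and it relies on the standard IBM Model 1 property that each French word can be aligned independently to the English word (or the null word) with the highest translation probability. The function name `align_model1`, the `NULL` token and the toy probability values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

NULL = "<NULL>"  # hypothetical name for the null (empty) English word


def align_model1(
    french: List[str],
    english: List[str],
    t: Dict[Tuple[str, str], float],
    use_null: bool = True,
) -> List[Optional[int]]:
    """For each French word, return the index of the English word that
    maximizes t(f | e), or None if the null word wins.

    Under IBM Model 1 the alignment prior is uniform, so the most probable
    alignment decomposes into one independent argmax per French word.
    """
    candidates: List[Tuple[Optional[int], str]] = list(enumerate(english))
    if use_null:
        # The null word competes with the real English words.
        candidates = [(None, NULL)] + candidates

    alignment: List[Optional[int]] = []
    for f in french:
        best_index: Optional[int] = None
        best_prob = -1.0
        for index, e in candidates:
            prob = t.get((f, e), 0.0)  # unseen pairs get probability 0
            if prob > best_prob:
                best_index, best_prob = index, prob
        alignment.append(best_index)
    return alignment


# Toy probabilities (invented for illustration, not taken from the paper).
toy_t = {
    ("la", "the"): 0.7,
    ("la", "house"): 0.1,
    ("maison", "house"): 0.8,
    ("maison", "the"): 0.05,
    ("de", NULL): 0.4,
    ("de", "the"): 0.1,
    ("de", "house"): 0.05,
}

with_null = align_model1(["la", "maison", "de"], ["the", "house"], toy_t, use_null=True)
without_null = align_model1(["la", "maison", "de"], ["the", "house"], toy_t, use_null=False)
```

In this toy setup, the French word "de" is left unaligned when the null word is allowed and is forced onto "the" when it is not, which is the kind of difference the null-word comparison in the abstract is concerned with.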

Similar resources

Teaching approaches to Computer Assisted Language Learning

Computers have been used for language teaching ever since the 1960s. Learning a second language is a challenging endeavor, and, for decades now, proponents of computer assisted language learning (CALL) have declared that help is on the horizon. We investigate the suitability of deploying speech technology in computer based systems that can be used to teach foreign language skills. In this case,...

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

Using functional magnetic resonance imaging (fMRI) to explore brain function: cortical representations of language critical areas

Pre-operative determination of the dominant hemisphere for speech and speech associated sensory and motor regions has been of great interest to neurological surgeons. This problem has been of utmost importance, but difficult to solve, requiring either invasive (Wada test) or non-invasive methods (Brain Mapping). In the present study we have employed functional Magnetic Resonance Imaging...

Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children

Introduction: In recent years, music has been employed in many intervention and rehabilitation programs to enhance cognitive abilities in patients. Numerous studies show that music therapy can help improve language skills in patients, including the hearing impaired. In this study, a new method of music training is introduced based on principles of neuroscience and capabilities of Persian languag...

Rehabilitation Approaches for Drug Abuse, Addiction and Pediatric Issues

The current issue of the Iranian Rehabilitation Journal contains original research evaluating the efficacy of addiction rehabilitation, an evaluation of a child rehabilitation system for community based research, a reading program for children with Down syndrome, auditory stream segregation in auditory processing disorder, speech and language disorders, quality of life of adolescents with hearing ...

Publication date: 2007